A New Technique for Increasing the 

Flexibility of Recursive Least 

Squares Data Smoothing 

By N. LEVINE 

(Manuscript received October 19, 1960) 

A method of performing recursive least squares data smoothing is de- 
scribed in which optimum (or arbitrary) weights can be assigned to the 
observations. The usual restriction of a constant data interval can be re- 
moved without affecting the optimum weighting or recursive features. The 
method also provides an instantaneous {i.e. real time) estimate of the statis- 
tical accuracy in the smoothed coordinates for a set of arbitrary data intervals. 
Optimum gate sizes for arbitrary predictions can be determined. These 
features greatly increase the flexibility of recursive least squares data smooth- 
ing, and several applications are discussed. 

I. INTRODUCTION 

During the past few years, a need has arisen for data smoothing tech- 
niques which can be applied, in real time, to radar observations of bodies 
traveling along highly predictable trajectories. The observations are 
usually processed in digital computers, so that much effort has been 
expended in devising techniques suited to the advantages and limitations 
of computers. This paper is concerned with one such technique, recursive 
least squares smoothing, whose theoretical foundation was established 
several years ago. 1 It is our purpose to show how this technique can be 
made considerably more flexible so as to encompass a wide variety of 
practical situations while maintaining its suitability for computer use. 

By the term "recursive," we mean that a smoothed coordinate is deter- 
mined from a previously computed average of past data (one number) 
and a new observation. Thus the storage requirements are independent 
of the number of observations and arc actually quite modest. We will 
also use the term "optimum smoothing," which is to be interpreted in the 
least squares sense, i.e., the weighting of data inversely proportional to 
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their error variances. The ability to perform optimum smoothing in real 
time where the variations (not necessarily the magnitudes) in observa- 
tional error are determinable (e.g., trajectory-dependent errors) is one 
of the advantages of the method described here. Over long periods of 
observation, a significant increase in the accuracy of the smoothed co- 
ordinates may be achieved by properly weighting the data. In the actual 
derivation, however, no restrictions are placed on the weighting factors, 
thus allowing an arbitrary sequence of weights to be placed on the data. 

Sections II and III are devoted to the estimation of position and 
velocity assuming the body under observation is traveling along a straight 
line with no acceleration. In Section II we consider the case of constant 
data intervals where some emphasis has been placed on the ability to 
compensate properly for missing observations. We have also included 
formulas for determining optimum gate sizes for predicted observations, 
which are of considerable importance in some tracking systems. In Sec- 
tion III the method is extended to handle the case of arbitrary data 
intervals. 

That the restriction to linear flight does not limit the applicability of 
this method to many practical cases is shown in Section IV, where meth- 
ods for including the effects of known accelerations are discussed. In Ap- 
pendix A we show that the sum of square deviations from the least 
squares line of regression may be obtained recursively. This result may 
be useful as a means of real-time error detection or as a way of providing 
an instantaneous estimate for the average observational error during 
tracking. Appendix B describes the changes necessary in performing least 
squares recursive smoothing over a fixed number of prior observations 
— in contrast to the method in the main text, where the smoothing is 
effective over all observations. Finally, in Appendix C, the extension of 
the method to include estimation of acceleration is outlined. 

The derivation and discussion of this method of data smoothing are 
presented in the context of radar observations of moving bodies. How- 
ever, this technique can be applied to any observable quantity which can 
be expressed as a linear combination of functions where the expansion co- 
efficients are to be estimated. 

II. CONSTANT DATA INTERVALS 

The derivation of this method of performing recursive least squares 
smoothing is based on a technique employed by Kaplan. 2 For simplicity, 
we initially consider the case of straight-line unaccelerated flight. A se- 
quence of observations x { ,i = 1,2, • • • ,n,r seconds apart, are made by 
an instrument whose measurement errors are taken to be uncorrelated 
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and normally distributed with mean zero and variance at. We wish to 
obtain the least squares estimate of the position (x n ) and velocity (v n ) 
as of the nth observation* by suitably combining the latest observation 
(x») with a linear combination of the n — 1 previous observations. To do 
this, we will first minimize the sum of squared deviations from the least 
squares line of regression, R n , assuming n observations have been made. 
We may write this sum as follows: 

R,, 2 = E lb - [.?„ - (n - i)u n ]\ V- , (1) 

1=1 

where u„ = v„t and Wi is an arbitrary weighting factor. Differentiating 
this equation with respect to x„ and u„ and setting each resulting equa- 
tion equal to zero yields the normal equations: 

ll n n 

x„ 2 w i ~ a " 2 (n — l)Wj = ^ ZtW,- , 
i'=l i=l i'=l 

(2) 
— •<*,, J2 ( fl ~ i)wi + u» IZ ( n ~ iYw> = — 2Z %i( n — i)"u>i ■ 

i=l 1 = 1 1=1 

We define the sums F„ , G„ , and H„ as 

Fi, = X>.-, 

i=i 

G„ = 2 (n - i)v>i , (3) 

i=i 

i=i 
so that equations (2) take the form 

F„x„ - G H u n = 2 xWi , (4) 

1=1 

-G„.r„ + ff„w n = -Sa-,(n — *)«»*. (5) 

The simultaneous solution of (4) and (5) yields the standard least 
squares estimates of position and velocity; however, they explicitly 
involve the presence of all observations £, . To cast this in recursive form, 



Estimation to any other time (n + p)r can be included by replacing (1) by 

Y, \xt - [x„+p - (n + /> - i)f< n +p]\ 2 Wi = min 

.=1 
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we repeat the process of (1) through (5) omitting the last observation 
x„ . This yields estimates of the (n — l)st position and velocity: 



R„-i 2 = E {Sh - [xn-i - (n - 1 - i)Un-i]\ 2 Wi . (6) 



Differentiating with respect to .f„_i and t*„_i and making use of defini- 
tions (3), the resulting equations may be written in the form 



F n (x H -i + M„_l) - G n U„-i = J2 XiWi + w n (x„-i + w„_i), (7) 

-(?„(£„_, + w„_i) + ff n w„_! = -Z**(» - *>* (8) 

Subtracting (7) from (4) and (8) from (5), we have 
F n [x„ - (.f-,,-1 + Un-i)] - G„(u n - U n -l) 

= w„[x n - (.r„_i + Un-i)}, 

-G„[Xn ~ (Xn-1 + Un-l)] + H n (il n ~ M„_l) = 0, 

whose solutions are: 

x„ = (x„-i + M„_i) + a»[x„ - (.f„_i + tin-i)], (10) 

m„ = m„_i + j8„[5:„ - (x„_i + w„_i)], (11) 

where 

wJ7« 



./» ' 



(12) 



ff.==£\ (13) 

J n = F n H n - G n \ (14) 

Equations (10) and (11) explicitly indicate that a smoothed coor- 
dinate is a linear combination of "old" data with a new observation; 
a„ and /3„ are the position and velocity smoothing coefficients and are 
calculated for each new observation from (12) and (13) using definitions 
(3) and (14). 

The estimates (x n -i + u„-i) and u H -\ are merely the predicted nth 



RECURSIVE LEAST SQUARES DATA SMOOTHING 825 

position and velocity (times the data interval) based on n — 1 observa- 
tions, so that we may define 

.f„ = .r„_i + «„_! , u n = «„_! . (15) 

The smoothing equations now take the form 

■r„ = An + a,Ax n — .?„) = (1 - a n ).t n -f a,,x n , (16) 

Un = #„ + Pn(x„ - X,,) = ill,, - PJ„) + pJt H , (17) 

which show that the predicted nth position and velocity may be used 
to represent all past data. 

The ability to optimize the smoothing for varying measurement 
errors <rf arises from the fact that the quantities F„ , G„ , H„ , and ./„ 
can themselves be summed recursively: 

F„ = F„_i + w n , 

G n = Gn-i + F„_, , 

(18) 
H„ = //„_, + 2G H _! + F,^ , 

J„ = ./„_! + w H H n . 

Thus, each new observation can be arbitrarily weighted by w„ and, as 
can be shown by statistical analysis, optimumly weighted by choosing 
w„ = \/<t„". Since J„ is defined by (14), the last recursion relation above 
may be used as a consistency check. Note that a missing observation 
may be properly accounted for by choosing its weighting factor equal to 
zero and cycling the sums as usual. Equations (10) through (13) then set 
the smoothed coordinates equal to their predicted values, and the future 
weighting of both old and new data is now altered to compensate for the 
missing observation. To illustrate this, Tables I and II have been con- 
structed to show how the weighting of data changes for the case of miss- 
ing second and fifth observations. Here we have chosen w\ = 1 for sim- 
plicity. Note that the effect of the missing observations on the variance 
ratios <r Zn la^ and ai, n "/a " (see Table II) diminishes rapidly with the 
addition of new observations compared to the values of these ratios 
when all observations are present. 

The actual weighting coefficients applied to the observations to yield 
the smoothed position and velocity coordinates can be determined by 
solving (4) and (5) simultaneously. If we define 

x„ = zldXi, u„ = ^(hxi, (19) 



i-i 
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Table 


I — All Observations Present (u>, 


= 1) 




n 


i 


2 


3 


4 


5 


6 


7 


8 


F 


l 


2 


3 


4 


5 





7 


8 


G 







3 


6 


10 


15 


21 


28 


H 







5 


14 


30 


55 


91 


140 


J 







6 


20 


50 


105 


196 


336 




(1) 




5/6 


7/10 


3/5 


11/21 


13/28 


5/12 





(0) 




1/2 


3/10 


1/5 


3/21 


3/28 


1/12 


<n*/v 2 


1 




0.833 


0.700 


0.600 


0.524 


0.464 


0.417 


a-.W 




2 


0.500 


0.200 


0.100 


0.057 


0.036 


0.024 




Xi 


•r 2 


£3 


x* 


h 


3-6 


X-j 


x s 


X. 


1 




1 














-?3 


-1/6 


2/6 


5/6 












X* 


-2/10 


1/10 


4/10 


7/10 










x 5 


-1/5 





1/5 


2/5 


3/5 








Xtt 


-4/21 


-1/21 


2/21 


5/21 


8/21 


11/21 








-5/28 


-2/28 


1/28 


4/2S 


7/28 


10/28 


13/2S 




X S 


-2/12 


-1/12 





1/12 


2/12 


3/12 


4/12 


5/12 


Hi 


-1 


1 














us 


-1/2 





1/2 












(7 4 


-3/10 


-1/10 


1/10 


3/10 










(7& 


-2/10 


-1/10 





1/10 


2/10 










-5/35 


-3/35 


-1/35 


1/35 


3/35 


5/35 








-3/28 


-2/28 


-1/28 





1/28 


2/28 


3/28 




Si 


-7/84 


-5/84 


-3/84 


-1/84 


1/84 


3/84 


5/84 


7/84 



then the .solution of (4) and (5) yields: 



I 



c . = * [H n - (n- i)G n ]wi , 



tU = f [G n 



(n — i)F„]wi. 



(20) 



(21) 



Those coefficients are tabulated in Tables I and II for the special cases 
considered. 

In digital computer applications, this method would require storage 
of f„ , Un , F n -i , Gn-i , H n -, (and ./„_i , if desired). Upon receipt of 
a-,, , w n is determined; the sums are updated, (18); a,, and /3„ are com- 
puted, (12) and (13); x„ and u„ are determined, (16) and (17); and 
x, l+ i and tin+i are formed, (16), and stored with the current values of 
the sums. The amount of storage and computation per cycle is inde- 
pendent of the number of observations. 

In situations where this type of smoothing can be employed, the 
method outlined here offers several advantages over conventional 
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Table II- 


- Observations 


2 and 5 Missed 




H 


1 


2 


.? 


4 


5 


6 


7 


8 


F 


1 


1 


■> 


3 


3 


4 


5 





G 


o 


1 


•> 


4 


7 


10 


14 


19 


II 


(J 


1 


4 


10 


21 


38 


62 


95 


J 








4 


14 


14 


52 114 


209 


a 


(i) 





1 


5/7 





19/20 31/57 


95/209 





((). 





1/2 


2/7 





5/20 7/57 


19/209 


nW 


1 




1 


0.714 


1.500 


0.741 0.544 


0.455 


<T,?W 






0.500 


0.214 


0.214 


0.077 0.044 


0.029 




•h 


x.. 


h 


&4 


h 


X 6 X 7 


i- 8 


Jl 


1 
















•T.T 







1 












Xa 


-1/7 




3/7 


5/7 










J':, 


-1/2 




1/2 


1 










*c, 


-6/26 




4/20 


9/20 




19/26 






•r? 


-11/57 




3/57 


10/57 




25/57 


31/57 




•?» 


-38/209 







19/209 




57/209 


7(5/209 


95/209 




-1/2 




1/2 












I'd 


-5/14 




1/14 


4/14 










ft-, 


-5/14 




1/14 


4/14 










ft a 


-5/20 




-1/20 


1/26 




5/20 






II- 


-10/114 




-0/114 


-1/114 




9/114 


14/114 




Uh 


-23/209 




-11/209 


-5/209 




7/209 


13/209 


19/209 



methods. For example, the weighting factors w t can be arbitrarily chosen 
without affecting the exactness condition; i.e., in the absence of errors, 
true straight-line data are unaltered by the smoothing process. Thus, 
the Wi's can be chosen to place greater (or lesser) emphasis on new data 
by increasing (or decreasing) their relative weight with respect to old 
data. If the measurement errors (of) vary in a known way with posi- 
tion, velocity, or time, the Wi can be chosen equal to 1/<t, so as to op- 
timize the smoothing for these errors. It is important to note here that 
optimum smoothing requires only a knowledge of the ratio of the errors, 
not their magnitudes. Of course, the magnitudes must be known in order 
to evaluate the statistical accuracy of the smoothed coordinates. 

The presence 1 of the sums makes it possible to determine the statistical 
accuracy in the smoothed coordinates at any time (assuming uncorre- 
lated Gaussian measurement errors) for two important cases: 

(a) the weighting factors arc chosen equal to A' a,", where K is a 
normalization constant ; 

( b ) 0-," = <T " (a constant for all i) and u>, = 1 or 0. 
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In both cases, we may write 

n n 

= EftVi <r S „ 2 = 22di 2 <Ti 2 , (22) 



2 



which, by making use of (3) and (14), can be evaluated as 



0i 



2 



= K^, (23) 



Jn ' 

KF\\ 

T 2 J n / 



<7 Un 2 = A^ (or<r s „^ : ', ) 

for case (a), and K replaced with o- 2 for case (b). 

The accuracy of an arbitrary prediction of p units (pr seconds) into 
the past or future may be determined from 



2 

Or., , „ 



= £ (« + pd<)W, 

(25) 
= j [H n + 2pG n + pXL 

Equation (25) is of great value in certain tracking systems where the 
(n + l)st observation is "captured" by making a prediction, £ n+i of 
x n+ i and surrounding it with a gate within which the observation must 
fall. The optimum gate size is obtained by adding the variances of the 
(n + l)st observation and the prediction. Thus, using (25) with p = 1, 
we find: 



optimum 1-sigma gate = j 



o- n+ i 2 + A' -p- 1 , case (a), 

1 (26) 

v|/h^, case (b). 



The gate size determined from (26) is automatically increased when 
observations are missed, and is optimum for any sequence of observa- 
tions and misses. 

For completeness, we present the explicit value of all quantities after n 
observations for the case of all to,- = 1 : 

F„ = n, 

„ n(n — 1) 
Cr« = K » 
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n(n - 1)(2» - 1) 



H„ = 



J n = 



fin = 



C; = 



d t = 



6 

2/2 i \ 

n \n —I) 
~~ 12 ' 

2(2n - 1 ) 
n(n + 1) ' 

_0 

n(n + 1) ' 

2[(2n - 1) - 3(n - i)] 
n(n + 1) 

6[(n - 1) - 2(n - Q] 
(n - l)n(n + 1) 



2 2 2(2n - 1) 2 

n(n + 1) 



2 0"o 12 

07, .. 



t 2 (n - l)n(n + 1) ' 



2 = „ 2 (n - l)(2n - 1) + 6p(n - 1) + jjp' 
<7 - r " +p -<T ° " (n - l)n(n + 1) 



optimum 



i • + . /(» + 1)(» + 2) 
1 -.sigma gate = *„ i/ - ' /v ' — ^ . 



III. VARIABLE DATA INTERVALS 

111 many cases of interest, the time intervals between observations 
may vary over wide limits. The results of the previous section can easily 
be extended to include the case of arbitrary observation times l { by 
replacing the quantity (n — i) by (/„ — h) and replacing u n by v n . 
The sums G„ and H„ are redefined as 

G n = Z On ~ l)Wi , 
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where F n is unchanged, (3), and the recursion relations become 

F„ = Fn-! + w„ , 

G„ = G„_l + (l„ - tn-l)F n -l, 

II„ = //„_, + 2(1 - L_i)G n _x + (l„ - ?„-i)X-i (28) 

= //„_! + (l„ - L-l)(G n + G„-l), 
J„ = Jn-l + W n H n . 

The smoothing equations retain the same form, except that the 
smoothing coefficient /3„ now implicitly contains the time dependence 

f,„ = r.„ + £„(£„ - .r„). (29) 

The predicted (n + l)st observation, x n+ i , is now computed from 

X ll+1 = X n + (l n+1 - ln)Vn (30) 

where, obviously, one must either know l„+i before receipt of the (n + l)st 
observation or defer computation of x n+ i until after the (n + l)st observa- 
tion is made. For those systems which must predict the (n + l)st ob- 
servation in order to "capture" it, one must estimate the time of ob- 
servation (L+i) and apply an appropriate gate about x n+ i = x„ + 
(i ll+ i — l n )v n , large enough to account for all sources of error. No 
degradation in the quality of the fit is made if only a poor estimate of 
t, l+1 can be obtained, since this is merely a device for capturing x n+1 . 
Once the observation has been received, x n+ x can be corrected to yield 
.r„+i as follows: 

x n+1 = x ll+l ' + (?„+i " L +l )v„ +] . (31) 

In the preceding section, we described how a missing observation 
could be properly accounted for. In the case of variable data intervals, 
only the time difference between successive observations enters the 
equations. Thus, an observation that was anticipated (x H+ i at l n +i) but 
never made (w„+i = 0) has no effect on the equation. The next observa- 
tion is predicted at time l n+i , i.e., 

.r„+2 = 4+1 + (ln+Z - i l+ l)Vn+l 

= x H + (t n+2 - l n )v n , 

and the only quantity entering the equations is (?„ +2 — L). 

The determination of the optimum gate size is now somewhat more 
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complicated than before. In general, i,,+\ is determined from some func- 
tion of present position and velocity, T(x„ , v a ), so that 

x„+i = .?„ + r„T(.r„ , r„). 

In the approximation for small estimation errors, cr* ,' can be deter- 
mined from* 

*i* = «=; (^Y + &* „ (*&) (**=*) + ., * (?*^ 



where, using ( 19) through (22) and definition ( 14), the covariance may 
be evaluated as 

ViuVn = Jl Cdi<Ti = K -T 
i-l J n 

and the gate size becomes 

optimum 1-sigma gate = \/<r 2 4- a- 2 

For most practical purposes, an adequate approximation to the op- 
timum gate size can be made by first estimating t ll+ i and then evaluating 

# n+1 = H„ + 2(/, (+1 - l R )G n + (L+i - l)-F n , (34) 

which is inserted in (20). 

Note that for this case of variable data intervals, the quantities H„/J n 
and F„/J„ still determine the instantaneous position and velocity vari- 
ances, respectively, and (25) (with p now equal to the prediction time 
in seconds) determines the statistical accuracy of an arbitrary extrapola- 
tion or interpolation. 

IV. KNOWN ACCELERATION 

The previous sections have dealt with the case of zero acceleration, 
which is but a special case of motion with known acceleration. For the 
case of known and constant acceleration a, all the preceding results apply 
if one modification of the smoothing equation is made. The predicted 
( /( + 1 )st position and velocity should be determined as follows: 

;f\,+i = x„ + r„(l, +] - /„) + ha(l„ +1 - l n )\ 

( 35 ) 

''»+i = v„ + a(( ll+ i - i„) 

with (?„+i — t„) = t for the case of constant data intervals. 
* Sop, for example, Ref. 3, p. 51. 
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If the acceleration is a known function of the position and velocity 
a(x, v), one could treat this case as above by evaluating a(x, v) at (x„ , v n ) 
and using (35). This procedure works very well if a(x, v) is not a sensi- 
tive function of position and velocity and the time interval between 
observations is not too large. Possible criteria for these restrictions for 
the case of constant data intervals are 
■• 

I- [a(X n ± <T- Xn , V n ± 07. n ) - a(*n ,Vn)] « ffxn +1 

or 

r[a(x„ ± a- Xn , v n ± a- Vn ) - a(x„ , v n )] « a ( . n+l , 

which express the fact that the variation in the acceleration due to the 
errors in the smoothed data should not contribute significantly to the 
error in predicted position or velocity, lor acceleration functions or 
smoothing times [i.e., (n - l)r] which do not satisfy the above criteria, 
other methods must be employed to optimize the smoothing. 

In order to assess the systematic error in smoothed position and 
velocity due to a varying component of acceleration not compensated 
for by the method outlined above, one can replace the varying component 
by some average value. Then it is easy to show that, for a constant 
acceleration a, the differences between true position and velocity and the 
linearly smoothed values are 

(n - l)(n - 2) /oA 

V " 7 (36) 

T - H *■ I \ 
Vn ~ V n = 7y ( - flT ' '■ 

It is obvious that this method of performing least squares smoothing 
can be extended to the case of an unknown but constant acceleration 
which is a special case of motion with known "jerk" (rate of change of 
acceleration). The derivation and some results for this case are given in 
Appendix C, and a comparison with the estimation errors for linear 
smoothing is made. 
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APPENDIX A 

Recursive Summation of Squared Residuals* 

In connection with a study of methods to detect errors in the smooth- 
ing of data by the technique described in the main text, it was found that 
the sum of squared deviations from the least squares line of regression 
could be determined recursively. Insofar as this may be of some interest 
in certain data processing systems, we will briefly outline the derivation 
and present the results in this appendix. 

The sum of squared deviations from the least squares line of regression 
(hereafter called squared residuals) is defined by (1), which we repeat 
here : 

R,r = Z [xi - x„ + (n - i)u n fwi . (37) 

1=1 

The recursion relation may be obtained by replacing all n's by (n + 1 )'s 
in (37): 

R /l+ i 2 = E % - x n+1 + (n + 1 - i)u n+J f Wi , (38) 

i=i 

which can be immediately converted to: 

#»+i 2 = iL [Xi - •i'/.+i + ( n + 1 - i)u, l+1 ] 2 Wi + w n+ i(x n+ i — x n+l ) 2 . 
i=i 

By making use of the smoothing and prediction equations, (10), (11), 
and (15), we can rewrite the last result as 

n 

R»+i 2 = H ([Xi - x„ + (n - i)u n \ 
1=1 

- j(.f„ +1 - x a+l )[a n+1 - (n + 1 - ;)/?„ +1 )]))V (39) 

+ u>„ +1 (:r„ + i — x„+i) . 

The evaluation of this sum is straightforward but tedious and makes 



* See also Kef. 7. 
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use of the denning relations for the quantities F„ , G„ , H„ , J„ , <x„ , and 
ft, , (3), (12), (13), (14), and the recursion relations, (18). The result 
may be written in either of two forms: 

Rn+l 2 = Rn + 1 " (a"n+l — #n+l) 

1 - ««+i (40) 

= R» 2 + w n+ i(l - a H+1 )(x ll+1 - x ll+i ) 2 . 

Thus only one storage slot and a modest amount of computation need 
be invested to obtain the instantaneous value of the squared residuals. 
In order to use this as an error-detecting device, some assumptions 
regarding the observational error must be made. If the variations in the 
o-, 2 are known and compensated for by choosing w„ = K/o„~, and if the 
errors are uncorrected between observations, then the average value of 
R,, 2 can be shown to be 

ave (R n 2 ) = K[( number of observations) — 2]. (41) 

If the <t," = da and w, = 1 or 0, then 

ave (R n 2 ) = a 2 (F n - 2) (42) 

since, for this case, F n is equal to the number of observations. The fact 
that the average value is proportional to two less than the number of 
observations is related to the fact that two degrees of freedom have been 
used up in determining x n and u„ . If the variations in <r, are unknown or 
too complicated to compensate for, then it is often possible, for a given 
system, to determine some bounds on the average value of squared 
residuals. 
If we define 



Rn' = 



Rn 2 



jiumber of observations) — 2 

then, from above we have 

[K for Wi = K/af, 
ave (i?„ 2 ) = { (43) 

((To for Wi = 1 or 0. 

It can be shown from a straightforward application of statistical analy- 
sis* that 

2K 2 

var (R n 2 ) = 7 r TT 7- ^ o (^) 

(number ot observations; — 2 

with K 2 replaced by o- for the case of Wi = 1 or 0. 
* See, for example, Ref. 3, p. 103. 
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If no prior knowledge of the magnitude of the measurement errors is 
available, the squared residuals can be used to obtain an estimate of the 
average measurement error if all w, = 1, or of the average effective meas- 
urement error if an arbitrary weighting sequence is used. 

APPENDIX B 

Fixed Memory Smoothing 

In contrast to the method described in the main text, where the 
smoothing is designed to include all observations (i.e., variable smooth- 
ing time), we can set up a smoothing procedure which fits a least 
squares line of regression to the latest r observations. The ability to 
optimize for varying observational errors and missing observations is 
retained, but we must provide storage for the /• observations. 

The two quantities to be minimized may be written as 



E I-' - . - •(■"" + (n - i)unfwi , 

E [%i - •'■« + <" ■ — iWnflVi . 

We define the sums F„ , G„ , H„ , and J ,, as follows: 
F n = E Wi, 

l=n+l-r 

G„ = E (n — i)wj, 
H„ = E (« - OV, 

i—n+l—r 

./„ = f„h„ - g,;\ 

and their recursion relations are 
F„ = F,,-! + w„ — w n - r , 

G„ = n -! + /•'„-, - rW„-r , 

fl„ = //„_, + 2G„_, + F„_i - r"to._ , 

./„ = ./„_, + w n H„ - u>„_ r [/7„_, - 2(r - !)£„_, + (r - l) 2 F„-i]. 
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The smoothing equations can be shown to be 

:f„ = .f„ + a n (x„ — x„) + y n [x„-r — (•*'» ~ ru„)], 

U„ = U„ + P„(X„ — i,.) + S n [Xn-r ~ (&n ~ >'#«)], 



(48) 



where 



w n H„ w n - r (rG n - H n ) 



«n = r , 7n = 



Jn ' J„ 

(49) 
w n G„ w n _ r (rF n - G„) 

" = ~77 ' 8 " = T a ' 

and the prediction equations are the same as (15). 

The quantities (rG n — H„) and (rF„ — G n ) may also be written re- 
cursively, for if we define 

A n = rF n - G„ , 

B„ = rG„ - H n , 

then 

A„ = A n -l — F„_i -f- fWn-r , 

B H = B„_ x + A,,.! - ((?„_! + F„_i). 

The explicit appearance of x„_ r in (48) demonstrates the fact that 
storage must be provided for the last r observations. As in the method 
described in the main text, the w, can be chosen completely arbitrarily 
and if they arc chosen equal to 1 a,,', the smoothing is optimized for the 
varying data error. It may be advantageous to provide storage for the r 
weighting factors Wt if they are computed from some function. This would 
remove the necessity for recomputing w n - r when it had already been 
computed at a previous n' = n — r. 

In terms of computer operation, the time required to process each 
new observation is independent of r. Thus significant savings in com- 
putation time over other methods (stored coefficients, cascaded simple 
sums) are achieved only when r is large. However, these other methods 
smooth data in a predetermined manner and cannot alter the weighting 
of data to compensate for an arbitrary sequence of observations and 
misses or optimize for trajectory-dependent measurement errors. 

For the case of all m>, = 1, the explicit values of all quantities are 
given at the end of Section II of the main text, with n replaced by /•. 
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APPENDIX C 



Estimation of Acceleration 

To distinguish between smoothing over observations of linear and 
parabolic flight, we shall refer to the former as linear smoothing and the 
latter as quadratic smoothing. For the latter, we want to minimize 

£ [it - [Sn - (n - i)u n + (n - t) 2 s n ]}V , (50) 

1=1 

where 

u„ = V n T, § n = \a„r . 

Straightforward differentiation of this expression with respect to the 
three unknowns yields the normal equations: 

n 

F n x„ - G H u„ + H n s„ = XI zjWi , 

n 

-O n Xn + H n U„ - /,,% = -E^(n - i)wi, (51) 

i=l 

H n x„ — I„U„ + K„s n = 5Z Xi(n — i) 2 Wi , 
i=i 

where F n , G„ , and H„ are defined by (3) and 

In = 2 ( n ~ i)*V>i . 

K n = zl (n — i) 4 Wi . 
i=i 

The recursion relations for /„ and K„ are 

I n = /„_, + Mn-l + 3G„_i + F„_, , 

K„ = K n -1 + 4/„_, + GH n -l + 4G„_1 + G»_, , 

and, as before, ./„ is defined as the determinant of the coefficients in the 
normal equations: 

./„ = F„H n K» - 2G„HnI« - H„ s - FJ,;- - K„G,r. (53) 

Repeating the procedure for the estimates .f„_i , w„_i , and s„_i and sub- 
tracting the resulting equations from the corresponding equations in 
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(51), we obtain the three-variable equivalent to (9). The solution for 
quadratic smoothing can be written 

x„ = x„ + a n (x„ - x„), 

tin = tin + &n{Xn ~ X,,), (54) 

s„ = s„ + y H {x n — .f„), 



where 



H n K„ - I,' 











LA ,, 






Jn 


1 










fin 


= w„ 


G„K, 


•Jn 


HJ n 










In 


= w n 


GJ„ 


— 


H,r 




■ 


/„ 




The 


auxi 


liary 


predict 


ion eq 


uations are 














X ll+ i 


= x„ 


+ u„ 


+ 


s„ , 










d„+i 


= u„ 


+ 2.s 


" J 












§n+l 


= s„ 


, 







(55) 



(56) 



where additional terms may be added to compensate for known "jerk." 
If, in addition to the definitions (19), we define 

n 

s n = 2 «A » 

1=1 

the weighting coefficients c, , rf, , and c, applied to the observations to 
yield smoothed position, velocity and acceleration are: 

d - j [(H„K n - I,, 2 ) - (G n K n - HJ„)(n - i) 

+ «?„/„ - H,;)(n - ?")'->,, 

di - 1 \(G n K n - HJ„) - (F„K„ - H,; 2 )(n - i) 

•'" (57) 

+ (FJ n - G n H n )(n - i) 2 ] Wi , 
d = J- [(0 J. - H n 2 ) - (FJ n - G n H„)(n - i) 
4- (F n H„ - G,r)(n - i)*\wi. 
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The variance ratios for smoothed position, velocity, and acceleration 
(assuming constant observational error) can be evaluated as 

2/ 2 H„K n — I n 

<Tx n / 0"0 — J > 

2/ o F„K n — H n ~ /,.£., 

o"u„/o"o = j > yds) 

2/ 2 F,g, - g: 

•' n 

For all Wi = 1, these quantities can be explicitly evaluated as func- 
tions of n: 

F„ = n , 

n(n - 1) 



G n = 
H n = 

In = 
Kn = 



2 

n(n - l)(2n - 1) 
6 

n{n — 1)" 
4 ' 

n(n - l)(2ro - l)(3/i 2 - 3n - 1) 
30 



= n\n - l)\n + l) 2 (n - 2)(n + 2) 
2160 

3(3n 2 - 3rc + 2) 



0n = 



y« = 



n(n + 


D(n 


+ 2)' 


18(2n - 


1) 


n(n + 


1)(« 
30 


■ + 2)' 



»(n+ l)(» + 2)' 



., (3ra 2 - 3/t + 2) - 6(2n - l)(n - t) + 10(n - *) s 
cv = 3 



n(n+l)(» + 2) 

3(2n - l)(n - 1)U - 2) - 2(8w - ll)(2n - l)(n - 

. + 30(n - l)(w - s 

'" ~ () ~ (n - 2)(n - l)n(n + l)(n + 2) 
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(n - l)(n - 2) - G(n - l)(w - i) + 6(n - i) 2 



a = 30 



2 
= <T{) 



(n - 2)(n - \)n{n + \){n + 2) 
"3(3m 2 - 3n + 2)' 



2 
= «nO"0 , 



2 00 

ff 5» = -? 



_n(w + l)(n + 2)_ 

j-oT 12(8n - 11) (2n - 1) 1 

r 2 L(^ - 2)(w - l)w(n + l)(n + 2)J' 



2 

ca.. — 



4_roT 180 1 

r 4 L(» - 2)(n - l)n(» + l)(n + 2)J" 



It is interesting to compare the position and velocity variance ratios 
for linear and quadratic, smoothing: 

position variance (quadratic) 3 (3m — Sn + 2) 9 



position variance (linear) 2(n + 2)(2w — 1) 4 

velocity variance (quadratic) _ (8n — 11) (2n — 1) 
velocity variance (linear) (n — 2)(n + 2) 



- for large n , 



16 for large n . 



These ratios indicate the cost in position and velocity accuracy for the 
inclusion of acceleration estimation. 
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